Accelerating ML Inference at Scale with ONNX, Triton and Seldon | PyData Global 2021

Duration: 28:29

Views: 352

Likes: 3

Date Created: Jan, 2022

Channel: PyData

Category: Science & Technology

Tags: python, learn to code, education, software, pydata, learn, coding, how to program, julia, opensource, scientific programming, numfocus, python 3, tutorial

Description: Accelerating ML Inference at Scale with ONNX, Triton and Seldon

Speaker: Alejandro Saucedo

Summary

Identifying the right tools for high-performance production machine learning may be overwhelming as the ecosystem continues to grow at breakneck speed. In this session we showcase how practitioners can optimize and productionise ML models in scalable ecosystems without having to deal with the underlying infrastructure. We'll be optimizing a GPT-2 model with ONNX and deploying it to Triton using Seldon and Tempo.

Description

Identifying the right tools for high-performance production machine learning may be overwhelming as the ecosystem continues to grow at breakneck speed. In this session we aim to provide a hands-on guide on how practitioners can productionise optimized machine learning models in scalable ecosystems using production-ready open source tools and frameworks.

We will dive into a practical use case: deploying the renowned GPT-2 NLP machine learning model using the Tempo SDK, which allows data scientists to productionise ML models without having to deal with the complexity of the underlying model servers and runtime environments (Docker and Kubernetes).

We will showcase the foundational concepts and best practices to consider when running production machine learning inference at scale. We will present some of the key challenges currently being faced in the MLOps space, as well as how each of the tools in the stack interoperates throughout the production machine learning lifecycle. Namely, we will introduce the benefits that the ONNX open standard and ONNX Runtime bring, as well as how we are able to leverage the optimized Triton server and the orchestration framework Seldon Core to achieve a robust production machine learning deployment that can scale with the needs of a growing team or organisation.
By the end of this talk, attendees will have a better understanding of how they can leverage these tools for their own models, as well as for the broad range of pre-trained models available. We will also provide a broad range of links and resources that will allow attendees to dive deeper into detailed areas such as observability, scalability and governance.

Alejandro Saucedo's Bio

Alejandro Saucedo is the Director of Machine Learning Engineering at Seldon Technologies, where he leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products with over 5 million installations. Alejandro is also the Chief Scientist at the Institute for Ethical AI & Machine Learning, where he leads the development of industry standards on machine learning explainability, adversarial robustness and differential privacy. With over 10 years of software development experience, Alejandro has held technical leadership positions across hyper-growth scale-ups and has a strong track record building cross-functional teams of software engineers.

GitHub: github.com/axsaucedo
LinkedIn: linkedin.com/in/axsaucedo
Twitter: twitter.com/axsaucedo
Website: ethical.institute

PyData Global 2021

Website: pydata.org/global2021
LinkedIn: linkedin.com/company/pydata-global
Twitter: twitter.com/PyData

pydata.org

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization.
PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/YouTubeVideoTimestamps